646 research outputs found
Modeling concept drift: A probabilistic graphical model based approach
A common approach to detecting and adapting to concept drift in classification is to treat the data as i.i.d. and use changes in classification accuracy as an indication of concept drift. In this paper, we take a different perspective and propose a framework, based on probabilistic graphical models, that explicitly represents concept drift using latent variables. To ensure efficient inference and learning, we resort to a variational Bayes inference scheme. As a proof of concept, we demonstrate and analyze the proposed framework using synthetic data sets as well as a real financial data set from a Spanish bank.
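The accuracy-based baseline mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's latent-variable framework; it only shows the common approach it contrasts itself with. The function name `detect_drift`, the window size, and the `drop` threshold are all hypothetical choices made for illustration.

```python
def detect_drift(correct_flags, window=50, drop=0.15):
    """Flag drift when recent accuracy falls well below a reference accuracy.

    correct_flags: sequence of 1 (correct prediction) or 0 (wrong prediction),
                   in the order the predictions were made.
    window:        number of predictions per accuracy estimate (assumed value).
    drop:          accuracy decrease that counts as drift (assumed value).

    Returns the index of the prediction at which drift is first flagged,
    or None if no drift is flagged.
    """
    if len(correct_flags) < 2 * window:
        return None  # not enough data for a reference and a recent window

    # Reference accuracy: the first `window` predictions.
    reference = sum(correct_flags[:window]) / window

    # Slide a window of the most recent predictions over the rest of the stream.
    for end in range(2 * window, len(correct_flags) + 1):
        recent = sum(correct_flags[end - window:end]) / window
        if reference - recent > drop:
            return end - 1
    return None


if __name__ == "__main__":
    # Synthetic stream: the classifier is always right, then always wrong.
    stream = [1] * 100 + [0] * 100
    print(detect_drift(stream))
```

A fixed reference window is the simplest possible choice; practical detectors usually update the reference or use a statistical test instead of a fixed threshold.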
Dynamic Bayesian belief network to model the development of walking and cycling schemes
This paper aims to describe a model which represents the formulation of decision-making processes (over a number of years) affecting the step-changes of walking and cycling (WaC) schemes. These processes can be seen as being driven by a number of causal factors, many of which are associated with the attitudes of a variety of actors, in terms of both determining whether any scheme will be implemented and, if it is implemented, the extent to which it is used. The outputs of the model are pathways as to how the future might unfold (in terms of a number of future time steps) with respect to specific pedestrian and cyclist schemes. The transitions of the decision-making processes are formulated using a qualitative simulation method, which describes the step-changes of the WaC scheme development. In this article, Bayesian belief network (BBN) theory is extended to model the influence between and within factors in the dynamic decision-making process.
Corrected score methods for estimating Bayesian networks with error-prone nodes
Motivated by inferring cellular signaling networks using noisy flow cytometry
data, we develop procedures to draw inference for Bayesian networks based on
error-prone data. Two methods for inferring causal relationships between nodes
in a network are proposed based on penalized estimation methods that account
for measurement error and encourage sparsity. We discuss consistency of the
proposed network estimators and develop an approach for selecting the tuning
parameter in the penalized estimation methods. Empirical studies are carried
out to compare the proposed methods and a naive method that ignores measurement
error with applications to synthetic data and to single cell flow cytometry
data
Quality of Service in IEEE 802.11ac and 802.11n Wireless Protocols with Applications in Medical Environments
Wireless computer networks are increasingly important as reliable
means of communication in medical environments. Evaluation of Quality of
Service (QoS) in wireless computer networks deployed in medical environments
can improve network performance and enhance utilization of resources. In this
study, the QoS offered by IEEE 802.11n and IEEE 802.11ac wireless protocols
was evaluated and compared using multiple point-to-point links for Voice Over
Internet Protocol (VoIP) traffic. QoS was evaluated based on Predictive Statistical
Diagnosis (PSD) and Probabilistic Neural Network (PNN). PSD and PNN based
QoS evaluation methods categorized the VoIP packets into low, medium and high
QoS types based on the packets' transmission delay, jitter, and percentage packet
loss ratio. Both PSD and PNN allowed QoS for VoIP to be quantified accurately.
It was shown that 802.11ac provides a higher QoS for VoIP transmission as
compared with IEEE 802.11n. The devised methods can be used in medical
environments for evaluation of wireless networks' QoS
Converting simulated total dry matter to fresh marketable yield for field vegetables at a range of nitrogen supply levels
Simultaneous analysis of economic and environmental performance of horticultural crop production requires qualified assumptions on the effect of management options, and particularly of nitrogen (N) fertilisation, on the net returns of the farm. Dynamic soil-plant-environment simulation models for agro-ecosystems are frequently applied to predict crop yield, generally as dry matter per area, and the environmental impact of production. Economic analysis requires conversion of yields to fresh marketable weight, which is not easy to calculate for vegetables, since different species have different properties and special market requirements. Furthermore, the marketable part of many vegetables is dependent on N availability during growth, which may lead to complete crop failure under sub-optimal N supply in tightly calculated N fertiliser regimes or low-input systems. In this paper we present two methods for converting simulated total dry matter to marketable fresh matter yield for various vegetables and European growth conditions, taking into consideration the effect of N supply: (i) a regression based function for vegetables sold as bulk or bunching ware and (ii) a population approach for piecewise sold row crops. For both methods, to be used in the context of a dynamic simulation model, parameter values were compiled from a literature survey. Implemented in such a model, both algorithms were tested against experimental field data, yielding an Index of Agreement of 0.80 for the regression strategy and 0.90 for the population strategy. Furthermore, the population strategy was capable of reflecting rather well the effect of crop spacing on yield and the effect of N supply on product grading
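The Index of Agreement reported in the abstract above is not defined there. A minimal sketch, assuming Willmott's original form d = 1 - Σ(P - O)² / Σ(|P - Ō| + |O - Ō|)², where O are observed values, P are simulated values, and Ō is the observed mean, might look as follows. The yield figures in the usage example are invented for illustration and are not taken from the paper.

```python
def index_of_agreement(observed, predicted):
    """Willmott's index of agreement d (assumed form; 1.0 is perfect agreement)."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal length")

    mean_obs = sum(observed) / len(observed)

    # Sum of squared errors between simulated and observed values.
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))

    # Potential error: distances of both series from the observed mean.
    potential_error = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    if potential_error == 0:
        raise ValueError("index is undefined when all values equal the observed mean")

    return 1.0 - squared_error / potential_error


if __name__ == "__main__":
    # Hypothetical marketable yields (observed vs. simulated), arbitrary units.
    observed_yield = [10.0, 20.0, 30.0]
    simulated_yield = [12.0, 18.0, 33.0]
    print(round(index_of_agreement(observed_yield, simulated_yield), 3))
```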
Molecular cloning and expression profiling of a chalcone synthase gene from hairy root cultures of Scutellaria viscidula Bunge
A cDNA encoding chalcone synthase (CHS), the key enzyme in flavonoid biosynthesis, was isolated from hairy root cultures of Scutellaria viscidula Bunge by rapid amplification of cDNA ends (RACE). The full-length cDNA of S. viscidula CHS, designated as Svchs (GenBank accession no. EU386767), was 1649 bp with a 1170 bp open reading frame (ORF) that corresponded to a deduced protein of 390 amino acid residues, a calculated molecular mass of 42.56 kDa and a theoretical isoelectric point (pI) of 5.79. Multiple sequence alignments showed that SvCHS shared high homology with CHS from other plants. Functional analysis in silico indicated that SvCHS was a hydrophilic protein most likely associated with intermediate metabolism. The active sites of the malonyl-CoA binding motif, coumaroyl pocket and cyclization pocket in CHS of Medicago sativa were also found in SvCHS. Molecular modeling indicated that the secondary structure of SvCHS contained mainly α-helixes and random coils. Phylogenetic analysis showed that SvCHS was most closely related to CHS from Scutellaria baicalensis. In agreement with its function as an elicitor-responsive gene, the expression of Svchs was induced and coordinated by methyl jasmonate. To our knowledge, this is the first report to describe the isolation and expression of a gene from S. viscidula
Learning genetic epistasis using Bayesian network scoring criteria
Background: Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases. Recently, machine-learning and data mining methods have been developed for learning epistatic relationships from data. A well-known combinatorial method that has been successfully applied for detecting epistasis is Multifactor Dimensionality Reduction (MDR). Jiang et al. created a combinatorial epistasis learning method called BNMBL to learn Bayesian network (BN) epistatic models. They compared BNMBL to MDR using simulated data sets. Each of these data sets was generated from a model that associates two SNPs with a disease and includes 18 unrelated SNPs. For each data set, BNMBL and MDR were used to score all 2-SNP models, and BNMBL learned significantly more correct models. In real data sets, we ordinarily do not know the number of SNPs that influence phenotype. BNMBL may not perform as well if we also scored models containing more than two SNPs. Furthermore, a number of other BN scoring criteria have been developed. They may detect epistatic interactions even better than BNMBL. Although BNs are a promising tool for learning epistatic relationships from data, we cannot confidently use them in this domain until we determine which scoring criteria work best or even well when we try learning the correct model without knowledge of the number of SNPs in that model. Results: We evaluated the performance of 22 BN scoring criteria using 28,000 simulated data sets and a real Alzheimer's GWAS data set. Our results were surprising in that the Bayesian scoring criterion with large values of a hyperparameter called α performed best. This score performed better than other BN scoring criteria and MDR at recall using simulated data sets, at detecting the hardest-to-detect models using simulated data sets, and at substantiating previous results using the real Alzheimer's data set. Conclusions: We conclude that representing epistatic interactions using BN models and scoring them using a BN scoring criterion holds promise for identifying epistatic genetic variants in data. In particular, the Bayesian scoring criterion with large values of a hyperparameter α appears more promising than a number of alternatives.
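The abstract above does not spell out the Bayesian scoring criterion. As a rough illustration, the sketch below assumes the standard BDeu form of the Bayesian Dirichlet score, in which α is the equivalent sample size, and scores a single disease node given one candidate parent set. The function name `log_bdeu` and the count tables are hypothetical and are not drawn from the paper's data.

```python
from math import lgamma, log


def log_bdeu(counts, alpha):
    """Log BDeu score of one node given one parent set (assumed score form).

    counts[j][k] is the number of cases with parent configuration j
    and child state k (for example, k = 0 for control, k = 1 for case).
    alpha is the equivalent sample size hyperparameter.
    """
    q = len(counts)      # number of parent configurations
    r = len(counts[0])   # number of child states
    a_j = alpha / q      # prior mass per parent configuration
    a_jk = alpha / (q * r)  # prior mass per (configuration, state) cell

    score = 0.0
    for row in counts:
        n_j = sum(row)
        score += lgamma(a_j) - lgamma(a_j + n_j)
        for n_jk in row:
            score += lgamma(a_jk + n_jk) - lgamma(a_jk)
    return score


if __name__ == "__main__":
    # Hypothetical counts: disease status depends strongly on a binary parent.
    with_parent = [[20, 0], [0, 20]]
    # Same 40 cases scored with no parent (a single configuration).
    without_parent = [[20, 20]]
    print(log_bdeu(with_parent, alpha=1.0) > log_bdeu(without_parent, alpha=1.0))
```

Comparing the score of a candidate SNP set as parents of the disease node against the score with no parents is one simple way to rank candidate models. The study itself compares 22 criteria, which this sketch does not attempt to reproduce.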
Constructing Biological Pathways by a Two-Step Counting Approach
Networks are widely used in biology to represent the relationships between genes
and gene functions. In Boolean biological models, it is mainly assumed that
there are two states to represent a gene: on-state and off-state. It is
typically assumed that the relationship between two genes can be characterized
by two kinds of pairwise relationships: similarity and prerequisite. Many
approaches have been proposed in the literature to reconstruct biological
relationships. In this article, we propose a two-step method to reconstruct the
biological pathway when the binary array data have measurement error. For a pair
of genes in a sample, the first step of this approach is to assign counting
numbers for every relationship and select the relationship with counting number
greater than a threshold. The second step is to calculate the asymptotic
p-values for hypotheses of possible relationships and select relationships with
a large p-value. This new method has the advantages of easy calculation for the
counting numbers and simple closed forms for the p-value. The simulation study
and real data example show that the two-step counting method can accurately
reconstruct the biological pathway and outperform the existing methods. Compared
with the other existing methods, this two-step method can provide a more
accurate and efficient alternative approach for reconstructing the biological
network